Maximising Audiovisual Correlation with Automatic Lip Tracking and Vowel Based Segmentation

Authors

  • Andrew Abel
  • Amir Hussain
  • Quoc Dinh Nguyen
  • Fabien Ringeval
  • Mohamed Chetouani
  • Maurice Milgram
Abstract

In recent years, the established link between the various human communication production domains has become more widely utilised in the field of speech processing. In this work, a state-of-the-art Semi Adaptive Appearance Model (SAAM) approach developed by the authors is used for automatic lip tracking, and an adapted version of our vowel-based speech segmentation system is employed to segment speech automatically. Canonical Correlation Analysis (CCA) on segmented and non-segmented data in a range of noisy speech environments shows that segmented speech has a significantly better audiovisual correlation, demonstrating the feasibility of our techniques for further development as part of a proposed audiovisual speech enhancement system.
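
As a rough illustration of the kind of measurement the abstract describes, the sketch below computes canonical correlations between two feature matrices with NumPy. It is not the authors' implementation: the function name `canonical_correlations`, the small regularisation term `reg`, and the assumption that audio and visual features are arranged as frame-by-feature matrices are all illustrative choices that do not come from the paper.

```python
import numpy as np


def canonical_correlations(X, Y, reg=1e-8):
    """Return the canonical correlations between X and Y, largest first.

    X: (n_frames, p) array, e.g. hypothetical audio features per frame.
    Y: (n_frames, q) array, e.g. hypothetical lip-shape features per frame.
    reg: small ridge term (an assumption) that keeps the covariances invertible.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = X.shape[0]

    # Centre each feature so the products below are covariances.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)

    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)

    # Whiten both sides with Cholesky factors: Cxx = Lx Lx^T, Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)

    # M = Lx^{-1} Cxy Ly^{-T}; its singular values are the canonical correlations.
    M = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, M.T).T

    return np.clip(np.linalg.svd(M, compute_uv=False), 0.0, 1.0)
```

Under these assumptions, comparing the first value returned for a segmented stretch of frames with the value for the unsegmented recording would give the sort of segmented versus non-segmented comparison the abstract reports.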


Similar resources

Lip movements affect infants' audiovisual speech perception.

Speech is robustly audiovisual from early in infancy. Here we show that audiovisual speech perception in 4.5-month-old infants is influenced by sensorimotor information related to the lip movements they make while chewing or sucking. Experiment 1 consisted of a classic audiovisual matching procedure, in which two simultaneously displayed talking faces (visual [i] and [u]) were presented with a ...


Automatic and Accurate Lip Tracking

Lip segmentation is an essential stage in many multimedia systems such as videoconferencing, lip reading, or low bit rate coding communication systems. In this paper, we propose an accurate and robust quasi automatic lip segmentation algorithm. First, the upper mouth boundary and several characteristic points are detected in the first frame by using a new kind of active contour: the “jumping ...


Audiovisual perception of openness and lip rounding in front vowels

Swedish nonsense syllables /ɡiɡ/, /ɡyɡ/, /ɡeɡ/ and /ɡøɡ/, produced by four speakers, were video-recorded and presented to male and female subjects in auditory, visual and audiovisual mode and also in cross-dubbed audiovisual form with incongruent cues to vowel openness, roundedness, or both. With audiovisual stimuli, subjects perceived openness nearly always by ear. Most subjects perceived roun...


Real-Time Lip Tracking for Audio-Visual Speech Recognition Applications

Developments in dynamic contour tracking permit sparse representation of the outlines of moving contours. Given the increasing computing power of general-purpose workstations, it is now possible to track human faces and parts of faces in real-time without special hardware. This paper describes a real-time lip tracker that uses a Kalman filter based dynamic contour to track the outline of the lips....


Automatic Prostate Cancer Segmentation Using Kinetic Analysis in Dynamic Contrast-Enhanced MRI

Background: Dynamic contrast enhanced magnetic resonance imaging (DCE-MRI) provides functional information on the microcirculation in tissues by analyzing the enhancement kinetics, which can be used as biomarkers for prostate lesion detection and characterization. Objective: The purpose of this study is to investigate spatiotemporal patterns of tumors by extracting semi-quantitative as well as w...




Publication date: 2009